Dataset statistics
| Number of variables | 34 |
|---|---|
| Number of observations | 532614 |
| Missing cells | 2063675 |
| Missing cells (%) | 11.4% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 103.6 MiB |
| Average record size in memory | 204.0 B |
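The two derived rows in this table follow from the raw counts. A minimal Python check, assuming the missing percentage is taken over all cells (observations × variables) and that MiB means 2^20 bytes; the variable names are illustrative only:

```python
# Figures copied from the Dataset statistics table.
n_vars = 34
n_obs = 532_614
missing_cells = 2_063_675
total_mib = 103.6

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_vars * n_obs
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 2**20 bytes.
avg_record_bytes = total_mib * 2**20 / n_obs

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values the report shows (11.4% and 204.0 B).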
Variable types
| Numeric | 10 |
|---|---|
| Categorical | 21 |
| Text | 1 |
| DateTime | 1 |
| Boolean | 1 |
Alerts
| Alert | Type |
|---|---|
| city_id_old has constant value "C006" | Constant |
| country_id has constant value "Turkey" | Constant |
| city_code has constant value "Konya" | Constant |
| hierarchy3_id has a high cardinality: 77 distinct values | High cardinality |
| hierarchy4_id has a high cardinality: 151 distinct values | High cardinality |
| hierarchy5_id has a high cardinality: 292 distinct values | High cardinality |
| Unnamed: 0 is highly overall correlated with store_id and 2 other fields | High correlation |
| cluster_id is highly overall correlated with hierarchy3_id | High correlation |
| hierarchy1_id is highly overall correlated with hierarchy2_id and 3 other fields | High correlation |
| hierarchy2_id is highly overall correlated with hierarchy1_id and 4 other fields | High correlation |
| hierarchy3_id is highly overall correlated with cluster_id and 5 other fields | High correlation |
| holiday is highly overall correlated with weekday | High correlation |
| month_name is highly overall correlated with promo_discount_type_2 and 1 other field | High correlation |
| price is highly overall correlated with promo_bin_2 and 2 other fields | High correlation |
| promo_bin_1 is highly overall correlated with promo_bin_2 and 2 other fields | High correlation |
| promo_bin_2 is highly overall correlated with hierarchy2_id and 8 other fields | High correlation |
| promo_discount_2 is highly overall correlated with hierarchy1_id and 8 other fields | High correlation |
| promo_discount_type_2 is highly overall correlated with hierarchy1_id and 9 other fields | High correlation |
| promo_type_1 is highly overall correlated with promo_bin_2 | High correlation |
| promo_type_2 is highly overall correlated with promo_bin_2 and 2 other fields | High correlation |
| revenue is highly overall correlated with sales | High correlation |
| sales is highly overall correlated with revenue | High correlation |
| season is highly overall correlated with month_name and 4 other fields | High correlation |
| store_id is highly overall correlated with Unnamed: 0 and 2 other fields | High correlation |
| store_size is highly overall correlated with Unnamed: 0 and 2 other fields | High correlation |
| storetype_id is highly overall correlated with Unnamed: 0 and 2 other fields | High correlation |
| week is highly overall correlated with season | High correlation |
| weekday is highly overall correlated with holiday | High correlation |
| promo_type_1 is highly imbalanced (76.6%) | Imbalance |
| promo_type_2 is highly imbalanced (99.4%) | Imbalance |
| promo_bin_1 has 456834 (85.8%) missing values | Missing |
| promo_bin_2 has 532135 (99.9%) missing values | Missing |
| promo_discount_2 has 532135 (99.9%) missing values | Missing |
| promo_discount_type_2 has 532135 (99.9%) missing values | Missing |
| sales is highly skewed (γ1 = 37.03501339) | Skewed |
| revenue is highly skewed (γ1 = 224.1724057) | Skewed |
| Unnamed: 0 has unique values | Unique |
| sales has 461233 (86.6%) zeros | Zeros |
| revenue has 461290 (86.6%) zeros | Zeros |
Reproduction
| Analysis started | 2024-07-11 16:14:19.912484 |
|---|---|
| Analysis finished | 2024-07-11 16:19:37.399884 |
| Duration | 5 minutes and 17.49 seconds |
| Software version | ydata-profiling v4.8.3 |
| Download configuration | config.json |
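The reported duration can be reproduced from the two timestamps. A short Python sketch using only the standard library; the variable names are illustrative:

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
start = datetime.fromisoformat("2024-07-11 16:14:19.912484")
end = datetime.fromisoformat("2024-07-11 16:19:37.399884")

elapsed = end - start
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```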
Unnamed: 0
Real number (ℝ)
HIGH CORRELATION  UNIQUE 
| Distinct | 532614 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6388555.1 |
| Minimum | 1793963 |
|---|---|
| Maximum | 8519019 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.1 MiB |
Quantile statistics
| Minimum | 1793963 |
|---|---|
| 5-th percentile | 1820593.6 |
| Q1 | 5840915.2 |
| median | 5974068.5 |
| Q3 | 8385865.8 |
| 95-th percentile | 8492388.3 |
| Maximum | 8519019 |
| Range | 6725056 |
| Interquartile range (IQR) | 2544950.5 |
Descriptive statistics
| Standard deviation | 2029951 |
|---|---|
| Coefficient of variation (CV) | 0.31774806 |
| Kurtosis | 0.36080466 |
| Mean | 6388555.1 |
| Median Absolute Deviation (MAD) | 203453 |
| Skewness | -0.94877514 |
| Sum | 3.4026339 × 10¹² |
| Variance | 4.1207011 × 10¹² |
| Monotonicity | Strictly increasing |
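Several rows in the quantile and descriptive tables are derived from others. A minimal Python check, assuming the usual definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean); because the inputs are the rounded figures printed above, small differences in the last digit are expected:

```python
# Figures copied from the Unnamed: 0 statistics tables (already rounded).
minimum, maximum = 1_793_963, 8_519_019
q1, q3 = 5_840_915.2, 8_385_865.8
mean, std = 6_388_555.1, 2_029_951

value_range = maximum - minimum   # reported: 6725056
iqr = q3 - q1                     # reported: 2544950.5
cv = std / mean                   # reported: 0.31774806

print(value_range)
print(round(iqr, 1))
print(round(cv, 6))
```

The strictly increasing values and 100% distinct count are what one would expect of a row index carried over as a column, although the report itself does not say where the column came from.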
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1793963 | 1 | < 0.1% |
| 8341492 | 1 | < 0.1% |
| 8341490 | 1 | < 0.1% |
| 8341489 | 1 | < 0.1% |
| 8341488 | 1 | < 0.1% |
| 8341487 | 1 | < 0.1% |
| 8341486 | 1 | < 0.1% |
| 8341485 | 1 | < 0.1% |
| 8341484 | 1 | < 0.1% |
| 8341483 | 1 | < 0.1% |
| Other values (532604) | 532604 | > 99.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1793963 | 1 | < 0.1% |
| 1793964 | 1 | < 0.1% |
| 1793965 | 1 | < 0.1% |
| 1793966 | 1 | < 0.1% |
| 1793967 | 1 | < 0.1% |
| 1793968 | 1 | < 0.1% |
| 1793969 | 1 | < 0.1% |
| 1793970 | 1 | < 0.1% |
| 1793971 | 1 | < 0.1% |
| 1793972 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 8519019 | 1 | < 0.1% |
| 8519018 | 1 | < 0.1% |
| 8519017 | 1 | < 0.1% |
| 8519016 | 1 | < 0.1% |
| 8519015 | 1 | < 0.1% |
| 8519014 | 1 | < 0.1% |
| 8519013 | 1 | < 0.1% |
| 8519012 | 1 | < 0.1% |
| 8519011 | 1 | < 0.1% |
| 8519010 | 1 | < 0.1% |
store_id
Categorical
HIGH CORRELATION 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.1 MiB |
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 5 |
| Min length | 5 |
Characters and Unicode
| Total characters | 2663070 |
|---|---|
| Distinct characters | 7 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | S0030 |
|---|---|
| 2nd row | S0030 |
| 3rd row | S0030 |
| 4th row | S0030 |
| 5th row | S0030 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| S0094 | 267115 | 50.2% |
| S0142 | 203453 | 38.2% |
| S0030 | 62046 | 11.6% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 923821 | 34.7% |
| S | 532614 | 20.0% |
| 4 | 470568 | 17.7% |
| 9 | 267115 | 10.0% |
| 1 | 203453 | 7.6% |
| 2 | 203453 | 7.6% |
| 3 | 62046 | 2.3% |
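The character counts for store_id can be re-derived from the three value counts. A minimal Python sketch, assuming each row contributes every character of its store ID once; the dictionary simply restates the counts from Common Values:

```python
from collections import Counter

# Value counts copied from the store_id Common Values table.
value_counts = {"S0094": 267_115, "S0142": 203_453, "S0030": 62_046}

# Each character is counted once per row in which it occurs.
char_counts = Counter()
for value, n_rows in value_counts.items():
    for ch in value:
        char_counts[ch] += n_rows

total_chars = sum(char_counts.values())

for ch, n in char_counts.most_common():
    print(ch, n, f"{100 * n / total_chars:.1f}%")
```

The totals agree with the report: 2663070 characters overall, of which 923821 are the digit 0.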
product_id
Text
| Distinct | 480 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.1 MiB |
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 5 |
| Min length | 5 |
Characters and Unicode
| Total characters | 2663070 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | P0015 |
|---|---|
| 2nd row | P0018 |
| 3rd row | P0035 |
| 4th row | P0051 |
| 5th row | P0055 |
| Value | Count | Frequency (%) |
|---|---|---|
| p0453 | 3004 | 0.6% |
| p0125 | 3003 | 0.6% |
| p0536 | 3002 | 0.6% |
| p0015 | 3001 | 0.6% |
| p0325 | 2993 | 0.6% |
| p0664 | 2992 | 0.6% |
| p0372 | 2984 | 0.6% |
| p0055 | 2981 | 0.6% |
| p0348 | 2980 | 0.6% |
| p0364 | 2978 | 0.6% |
| Other values (470) | 502696 | 94.4% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 703072 | 26.4% |
| P | 532614 | 20.0% |
| 1 | 200291 | 7.5% |
| 6 | 182000 | 6.8% |
| 5 | 181108 | 6.8% |
| 2 | 180972 | 6.8% |
| 4 | 177471 | 6.7% |
| 3 | 172096 | 6.5% |
| 7 | 139061 | 5.2% |
| 9 | 101551 | 3.8% |
date
Date
| Distinct | 1002 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.1 MiB |
| Minimum | 2017-01-02 00:00:00 |
|---|---|
| Maximum | 2019-09-30 00:00:00 |
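The distinct count can be compared with the calendar span between the minimum and maximum. A small Python check, assuming every value is a midnight timestamp, as the minimum and maximum suggest:

```python
from datetime import date

# Minimum and maximum copied from the date summary.
first_day = date(2017, 1, 2)
last_day = date(2019, 9, 30)
distinct_dates = 1002

# Inclusive number of calendar days in the range.
calendar_days = (last_day - first_day).days + 1

print(calendar_days, distinct_dates, calendar_days == distinct_dates)
```

Under that assumption, the two numbers matching is consistent with every calendar day in the range appearing at least once.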
sales
Real number (ℝ)
HIGH CORRELATION  SKEWED  ZEROS 
| Distinct | 510 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.33648231 |
| Minimum | 0 |
|---|---|
| Maximum | 301 |
| Zeros | 461233 |
| Zeros (%) | 86.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 2 |
| Maximum | 301 |
| Range | 301 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.8039157 |
|---|---|
| Coefficient of variation (CV) | 5.3611011 |
| Kurtosis | 4020.8841 |
| Mean | 0.33648231 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 37.035013 |
| Sum | 179215.19 |
| Variance | 3.2541119 |
| Monotonicity | Not monotonic |
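With 86.6% zeros and a maximum of 301, a large positive skewness is the expected outcome. The sketch below uses the plain moment definition g1 = m3 / m2^1.5 on a made-up toy sample. Neither the sample nor the estimator is taken from the report, which may apply a bias correction.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5, without bias correction."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical toy data, NOT drawn from the dataset:
# mostly zeros, a few small values, and one large outlier.
toy_sales = [0] * 87 + [1] * 8 + [2] * 3 + [5, 40]
symmetric = [-2, -1, 0, 1, 2]

print(skewness(symmetric))         # a symmetric sample has zero skewness
print(skewness(toy_sales) > 0)     # the zero-heavy sample is right-skewed
```

A single large value in a column dominated by zeros is enough to push the statistic far above the level where the mean is a useful summary, which is why the median and the share of zeros are more informative for sales.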
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 461233 | 86.6% |
| 1 | 40236 | 7.6% |
| 2 | 13562 | 2.5% |
| 3 | 5585 | 1.0% |
| 4 | 3363 | 0.6% |
| 5 | 1987 | 0.4% |
| 6 | 1401 | 0.3% |
| 7 | 923 | 0.2% |
| 8 | 703 | 0.1% |
| 9 | 533 | 0.1% |
| Other values (500) | 3088 | 0.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 461233 | 86.6% |
| 0.048 | 1 | < 0.1% |
| 0.064 | 3 | < 0.1% |
| 0.074 | 1 | < 0.1% |
| 0.078 | 1 | < 0.1% |
| 0.092 | 1 | < 0.1% |
| 0.096 | 2 | < 0.1% |
| 0.1 | 2 | < 0.1% |
| 0.108 | 1 | < 0.1% |
| 0.11 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 301 | 1 | < 0.1% |
| 300 | 1 | < 0.1% |
| 215 | 1 | < 0.1% |
| 193 | 1 | < 0.1% |
| 160 | 1 | < 0.1% |
| 156 | 1 | < 0.1% |
| 113 | 1 | < 0.1% |
| 95 | 1 | < 0.1% |
| 86 | 2 | < 0.1% |
| 82 | 1 | < 0.1% |
revenue
Real number (ℝ)
HIGH CORRELATION  SKEWED  ZEROS 
| Distinct | 2685 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.4032907 |
| Minimum | 0 |
|---|---|
| Maximum | 5879.35 |
| Zeros | 461290 |
| Zeros (%) | 86.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 7.87 |
| Maximum | 5879.35 |
| Range | 5879.35 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 14.525375 |
|---|---|
| Coefficient of variation (CV) | 10.350939 |
| Kurtosis | 77171.231 |
| Mean | 1.4032907 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 224.17241 |
| Sum | 747412.25 |
| Variance | 210.98653 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 461290 | 86.6% |
| 0.93 | 1777 | 0.3% |
| 3.24 | 1530 | 0.3% |
| 1.85 | 1364 | 0.3% |
| 2.78 | 1255 | 0.2% |
| 2.31 | 1246 | 0.2% |
| 3.7 | 925 | 0.2% |
| 1.39 | 867 | 0.2% |
| 1.16 | 835 | 0.2% |
| 6.48 | 812 | 0.2% |
| Other values (2675) | 60713 | 11.4% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 461290 | 86.6% |
| 0.01 | 6 | < 0.1% |
| 0.02 | 3 | < 0.1% |
| 0.23 | 31 | < 0.1% |
| 0.31 | 2 | < 0.1% |
| 0.36 | 1 | < 0.1% |
| 0.42 | 119 | < 0.1% |
| 0.46 | 225 | < 0.1% |
| 0.47 | 1 | < 0.1% |
| 0.51 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5879.35 | 1 | < 0.1% |
| 4874.07 | 1 | < 0.1% |
| 2101.94 | 1 | < 0.1% |
| 1968.94 | 1 | < 0.1% |
| 1811.57 | 1 | < 0.1% |
| 1596.1 | 1 | < 0.1% |
| 1347.5 | 1 | < 0.1% |
| 1303.38 | 1 | < 0.1% |
| 1270.34 | 1 | < 0.1% |
| 1256.14 | 1 | < 0.1% |
stock
Real number (ℝ)
| Distinct | 856 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.629129 |
| Minimum | 0 |
|---|---|
| Maximum | 2700 |
| Zeros | 2219 |
| Zeros (%) | 0.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 4 |
| median | 8 |
| Q3 | 16 |
| 95-th percentile | 50 |
| Maximum | 2700 |
| Range | 2700 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 29.790146 |
|---|---|
| Coefficient of variation (CV) | 1.9060657 |
| Kurtosis | 806.67867 |
| Mean | 15.629129 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | 14.67625 |
| Sum | 8324292.8 |
| Variance | 887.45277 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 6 | 40463 | 7.6% |
| 4 | 39627 | 7.4% |
| 5 | 37232 | 7.0% |
| 3 | 36774 | 6.9% |
| 2 | 32793 | 6.2% |
| 1 | 28817 | 5.4% |
| 7 | 28326 | 5.3% |
| 8 | 27492 | 5.2% |
| 12 | 24722 | 4.6% |
| 9 | 23092 | 4.3% |
| Other values (846) | 213276 | 40.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 2219 | 0.4% |
| 0.384 | 2 | < 0.1% |
| 0.415 | 13 | < 0.1% |
| 0.464 | 1 | < 0.1% |
| 0.515 | 1 | < 0.1% |
| 0.705 | 3 | < 0.1% |
| 0.785 | 1 | < 0.1% |
| 0.832 | 3 | < 0.1% |
| 0.93 | 5 | < 0.1% |
| 0.931 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2700 | 1 | < 0.1% |
| 2683 | 1 | < 0.1% |
| 2644 | 1 | < 0.1% |
| 2574 | 1 | < 0.1% |
| 2560 | 1 | < 0.1% |
| 2512 | 1 | < 0.1% |
| 2399 | 1 | < 0.1% |
| 1253 | 1 | < 0.1% |
| 1175 | 1 | < 0.1% |
| 1080 | 1 | < 0.1% |
price
Real number (ℝ)
HIGH CORRELATION 
| Distinct | 443 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 1012 |
| Missing (%) | 0.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 16.33836 |
| Minimum | 0.01 |
|---|---|
| Maximum | 1599 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.1 MiB |
Quantile statistics
| Minimum | 0.01 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3.5 |
| median | 8.65 |
| Q3 | 17.99 |
| 95-th percentile | 59.9 |
| Maximum | 1599 |
| Range | 1598.99 |
| Interquartile range (IQR) | 14.49 |
Descriptive statistics
| Standard deviation | 31.232437 |
|---|---|
| Coefficient of variation (CV) | 1.9116017 |
| Kurtosis | 561.88314 |
| Mean | 16.33836 |
| Median Absolute Deviation (MAD) | 6.05 |
| Skewness | 16.528175 |
| Sum | 8685505.1 |
| Variance | 975.46512 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 12394 | 2.3% |
| 3.5 | 10446 | 2.0% |
| 3.95 | 8869 | 1.7% |
| 19.9 | 8684 | 1.6% |
| 11.9 | 8324 | 1.6% |
| 0.75 | 7556 | 1.4% |
| 2.95 | 7296 | 1.4% |
| 29.9 | 7156 | 1.3% |
| 12.9 | 7099 | 1.3% |
| 4.9 | 6908 | 1.3% |
| Other values (433) | 446870 | 83.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.01 | 13 | < 0.1% |
| 0.25 | 466 | 0.1% |
| 0.3 | 6 | < 0.1% |
| 0.35 | 4 | < 0.1% |
| 0.4 | 6 | < 0.1% |
| 0.45 | 664 | 0.1% |
| 0.5 | 3240 | 0.6% |
| 0.58 | 26 | < 0.1% |
| 0.6 | 1471 | 0.3% |
| 0.65 | 3059 | 0.6% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1599 | 15 | < 0.1% |
| 1549 | 1 | < 0.1% |
| 1499 | 2 | < 0.1% |
| 1449 | 3 | < 0.1% |
| 1399 | 13 | < 0.1% |
| 1349 | 18 | < 0.1% |
| 699 | 185 | < 0.1% |
| 679 | 22 | < 0.1% |
| 655 | 24 | < 0.1% |
| 599 | 28 | < 0.1% |
promo_type_1
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 16 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 520.9 KiB |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 2130456 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | PR14 |
|---|---|
| 2nd row | PR14 |
| 3rd row | PR14 |
| 4th row | PR14 |
| 5th row | PR05 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| PR14 | 456834 | 85.8% |
| PR05 | 35490 | 6.7% |
| PR10 | 12782 | 2.4% |
| PR03 | 9071 | 1.7% |
| PR06 | 7590 | 1.4% |
| PR07 | 3111 | 0.6% |
| PR12 | 2353 | 0.4% |
| PR09 | 1963 | 0.4% |
| PR17 | 1890 | 0.4% |
| PR01 | 654 | 0.1% |
| Other values (6) | 876 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| P | 532614 | 25.0% |
| R | 532614 | 25.0% |
| 1 | 475201 | 22.3% |
| 4 | 457023 | 21.5% |
| 0 | 71105 | 3.3% |
| 5 | 35490 | 1.7% |
| 3 | 9101 | 0.4% |
| 6 | 7638 | 0.4% |
| 7 | 5001 | 0.2% |
| 2 | 2353 | 0.1% |
| Other values (2) | 2316 | 0.1% |
promo_bin_1
Categorical
HIGH CORRELATION  MISSING 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 456834 |
| Missing (%) | 85.8% |
| Memory size | 520.5 KiB |
Length
| Max length | 8 |
|---|---|
| Median length | 7 |
| Mean length | 6.0624835 |
| Min length | 3 |
Characters and Unicode
| Total characters | 459415 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | verylow |
|---|---|
| 2nd row | verylow |
| 3rd row | verylow |
| 4th row | low |
| 5th row | moderate |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| verylow | 30355 | 5.7% |
| low | 15786 | 3.0% |
| moderate | 12351 | 2.3% |
| high | 9385 | 1.8% |
| veryhigh | 7903 | 1.5% |
| (Missing) | 456834 | 85.8% |
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| verylow | 30355 | 40.1% |
| low | 15786 | 20.8% |
| moderate | 12351 | 16.3% |
| high | 9385 | 12.4% |
| veryhigh | 7903 | 10.4% |
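The two promo_bin_1 tables report different percentages for the same counts because they appear to use different denominators: all rows in the first, non-missing rows only in the second. A minimal Python sketch of that reading, using the "high" category as the example:

```python
# Figures copied from the promo_bin_1 tables.
n_rows = 532_614
n_missing = 456_834
count_high = 9_385

# Share of all rows (Common Values table).
share_of_all = 100 * count_high / n_rows

# Share of rows with a value present (plot table), under the
# assumption that missing rows are excluded from the denominator.
n_present = n_rows - n_missing
share_of_present = 100 * count_high / n_present

print(f"{share_of_all:.1f}% of all rows")
print(f"{share_of_present:.1f}% of non-missing rows")
```

The results, 1.8% and 12.4%, match the two values the report shows for "high".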
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| e | 62960 | 13.7% |
| o | 58492 | 12.7% |
| r | 50609 | 11.0% |
| l | 46141 | 10.0% |
| w | 46141 | 10.0% |
| v | 38258 | 8.3% |
| y | 38258 | 8.3% |
| h | 34576 | 7.5% |
| i | 17288 | 3.8% |
| g | 17288 | 3.8% |
| Other values (4) | 49404 | 10.8% |
promo_type_2
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 520.5 KiB |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 2130456 |
|---|---|
| Distinct characters | 7 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | PR03 |
|---|---|
| 2nd row | PR03 |
| 3rd row | PR03 |
| 4th row | PR03 |
| 5th row | PR03 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| PR03 | 532135 | 99.9% |
| PR02 | 354 | 0.1% |
| PR01 | 121 | < 0.1% |
| PR04 | 4 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| P | 532614 | 25.0% |
| R | 532614 | 25.0% |
| 0 | 532614 | 25.0% |
| 3 | 532135 | 25.0% |
| 2 | 354 | < 0.1% |
| 1 | 121 | < 0.1% |
| 4 | 4 | < 0.1% |
promo_bin_2
Categorical
HIGH CORRELATION  MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 532135 |
| Missing (%) | 99.9% |
| Memory size | 520.4 KiB |
Length
| Max length | 8 |
|---|---|
| Median length | 7 |
| Mean length | 7.0605428 |
| Min length | 4 |
Characters and Unicode
| Total characters | 3382 |
|---|---|
| Distinct characters | 10 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | verylow |
|---|---|
| 2nd row | verylow |
| 3rd row | verylow |
| 4th row | verylow |
| 5th row | verylow |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| verylow | 326 | 0.1% |
| veryhigh | 122 | < 0.1% |
| high | 31 | < 0.1% |
| (Missing) | 532135 | 99.9% |
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| verylow | 326 | 68.1% |
| veryhigh | 122 | 25.5% |
| high | 31 | 6.5% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| v | 448 | 13.2% |
| e | 448 | 13.2% |
| r | 448 | 13.2% |
| y | 448 | 13.2% |
| l | 326 | 9.6% |
| o | 326 | 9.6% |
| w | 326 | 9.6% |
| h | 306 | 9.0% |
| i | 153 | 4.5% |
| g | 153 | 4.5% |
promo_discount_2
Categorical
HIGH CORRELATION  MISSING 
| Distinct | 5 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 532135 |
| Missing (%) | 99.9% |
| Memory size | 4.1 MiB |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 1916 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 20.0 |
|---|---|
| 2nd row | 20.0 |
| 3rd row | 20.0 |
| 4th row | 20.0 |
| 5th row | 20.0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 20.0 | 308 | 0.1% |
| 50.0 | 122 | < 0.1% |
| 35.0 | 28 | < 0.1% |
| 16.0 | 18 | < 0.1% |
| 40.0 | 3 | < 0.1% |
| (Missing) | 532135 | 99.9% |
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| 20.0 | 308 | 64.3% |
| 50.0 | 122 | 25.5% |
| 35.0 | 28 | 5.8% |
| 16.0 | 18 | 3.8% |
| 40.0 | 3 | 0.6% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 912 | 47.6% |
| . | 479 | 25.0% |
| 2 | 308 | 16.1% |
| 5 | 150 | 7.8% |
| 3 | 28 | 1.5% |
| 1 | 18 | 0.9% |
| 6 | 18 | 0.9% |
| 4 | 3 | 0.2% |
promo_discount_type_2
Categorical
HIGH CORRELATION  MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 532135 |
| Missing (%) | 99.9% |
| Memory size | 520.5 KiB |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 1916 |
|---|---|
| Distinct characters | 7 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | PR04 |
|---|---|
| 2nd row | PR04 |
| 3rd row | PR04 |
| 4th row | PR04 |
| 5th row | PR04 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| PR02 | 185 | < 0.1% |
| PR04 | 141 | < 0.1% |
| PR03 | 109 | < 0.1% |
| PR01 | 44 | < 0.1% |
| (Missing) | 532135 | 99.9% |
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| PR02 | 185 | 38.6% |
| PR04 | 141 | 29.4% |
| PR03 | 109 | 22.8% |
| PR01 | 44 | 9.2% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| P | 479 | 25.0% |
| R | 479 | 25.0% |
| 0 | 479 | 25.0% |
| 2 | 185 | 9.7% |
| 4 | 141 | 7.4% |
| 3 | 109 | 5.7% |
| 1 | 44 | 2.3% |
product_length
Real number (ℝ)
| Distinct | 108 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3158 |
| Missing (%) | 0.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.5208278 |
| Minimum | 0 |
|---|---|
| Maximum | 100 |
| Zeros | 598 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2.8 |
| median | 5 |
| Q3 | 7.5 |
| 95-th percentile | 20 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 4.7 |
Descriptive statistics
| Standard deviation | 6.6375649 |
|---|---|
| Coefficient of variation (CV) | 1.0179022 |
| Kurtosis | 46.2261 |
| Mean | 6.5208278 |
| Median Absolute Deviation (MAD) | 2.5 |
| Skewness | 4.8038105 |
| Sum | 3452491.4 |
| Variance | 44.057267 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 52699 | 9.9% |
| 1 | 38538 | 7.2% |
| 5 | 37524 | 7.0% |
| 6 | 32373 | 6.1% |
| 4.5 | 32087 | 6.0% |
| 3 | 24588 | 4.6% |
| 4 | 22580 | 4.2% |
| 7 | 17071 | 3.2% |
| 6.5 | 12662 | 2.4% |
| 7.5 | 11537 | 2.2% |
| Other values (98) | 247797 | 46.5% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 598 | 0.1% |
| 0.3 | 376 | 0.1% |
| 0.5 | 2914 | 0.5% |
| 1 | 38538 | 7.2% |
| 1.5 | 8186 | 1.5% |
| 1.6 | 2570 | 0.5% |
| 1.7 | 5622 | 1.1% |
| 1.8 | 2179 | 0.4% |
| 2 | 52699 | 9.9% |
| 2.1 | 524 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 100 | 545 | 0.1% |
| 59 | 215 | < 0.1% |
| 47.8 | 678 | 0.1% |
| 44 | 72 | < 0.1% |
| 40.6 | 258 | < 0.1% |
| 40 | 1262 | 0.2% |
| 33 | 300 | 0.1% |
| 30 | 2063 | 0.4% |
| 29.3 | 162 | < 0.1% |
| 28 | 1918 | 0.4% |
product_depth
Real number (ℝ)
| Distinct | 139 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3133 |
| Missing (%) | 0.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 17.636908 |
| Minimum | 0 |
|---|---|
| Maximum | 160 |
| Zeros | 598 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 11.1 |
| median | 17 |
| Q3 | 22.5 |
| 95-th percentile | 33 |
| Maximum | 160 |
| Range | 160 |
| Interquartile range (IQR) | 11.4 |
Descriptive statistics
| Standard deviation | 11.421757 |
|---|---|
| Coefficient of variation (CV) | 0.64760542 |
| Kurtosis | 33.694102 |
| Mean | 17.636908 |
| Median Absolute Deviation (MAD) | 5.9 |
| Skewness | 3.8725702 |
| Sum | 9338407.5 |
| Variance | 130.45653 |
| Monotonicity | Not monotonic |
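
The high kurtosis and skewness suggest a long right tail. One common way to quantify it is the 1.5 × IQR rule (Tukey fences). The report itself does not apply this rule, so the sketch below is only an illustrative check using the quantiles listed above.

```python
# Quantiles copied from the product_depth tables above.
q1, q3 = 11.1, 22.5
maximum = 160.0

iqr = q3 - q1
# Conventional Tukey fences: values outside [lower, upper] are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# True when the reported maximum lies beyond the upper fence.
max_is_flagged = maximum > upper_fence
print(f"fences=({lower_fence:.1f}, {upper_fence:.1f}) max_is_flagged={max_is_flagged}")
```
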
| Value | Count | Frequency (%) |
| 17 | 19431 | 3.6% |
| 25 | 17363 | 3.3% |
| 20 | 16765 | 3.1% |
| 15 | 16733 | 3.1% |
| 12 | 16475 | 3.1% |
| 18 | 15680 | 2.9% |
| 16 | 15285 | 2.9% |
| 4 | 15065 | 2.8% |
| 23 | 14713 | 2.8% |
| 14 | 14259 | 2.7% |
| Other values (129) | 367712 | 69.0% |
| Value | Count | Frequency (%) |
| 0 | 598 | 0.1% |
| 1 | 1089 | 0.2% |
| 1.5 | 503 | 0.1% |
| 2 | 140 | < 0.1% |
| 3 | 8967 | 1.7% |
| 3.5 | 343 | 0.1% |
| 3.8 | 1642 | 0.3% |
| 4 | 15065 | 2.8% |
| 4.5 | 6848 | 1.3% |
| 4.8 | 202 | < 0.1% |
| Value | Count | Frequency (%) |
| 160 | 545 | 0.1% |
| 100 | 1262 | 0.2% |
| 88 | 287 | 0.1% |
| 80 | 678 | 0.1% |
| 77 | 1409 | 0.3% |
| 56 | 1714 | 0.3% |
| 55 | 491 | 0.1% |
| 48 | 1776 | 0.3% |
| 47 | 625 | 0.1% |
| 45.7 | 1348 | 0.3% |
product_width
Real number (ℝ)
| Distinct | 123 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3133 |
| Missing (%) | 0.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12.352003 |
| Minimum | 0 |
|---|---|
| Maximum | 100 |
| Zeros | 598 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 7.5 |
| median | 10 |
| Q3 | 15 |
| 95-th percentile | 30 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 7.5 |
Descriptive statistics
| Standard deviation | 8.1855113 |
|---|---|
| Coefficient of variation (CV) | 0.66268697 |
| Kurtosis | 16.339503 |
| Mean | 12.352003 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 2.7833404 |
| Sum | 6540150.8 |
| Variance | 67.002596 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 10 | 37446 | 7.0% |
| 9 | 31888 | 6.0% |
| 8 | 24807 | 4.7% |
| 13 | 24092 | 4.5% |
| 11 | 20271 | 3.8% |
| 7 | 19307 | 3.6% |
| 12 | 17634 | 3.3% |
| 16 | 15083 | 2.8% |
| 4.5 | 14395 | 2.7% |
| 7.5 | 14050 | 2.6% |
| Other values (113) | 310508 | 58.3% |
| Value | Count | Frequency (%) |
| 0 | 598 | 0.1% |
| 1 | 1783 | 0.3% |
| 2 | 6009 | 1.1% |
| 2.5 | 2984 | 0.6% |
| 3 | 6546 | 1.2% |
| 3.3 | 1002 | 0.2% |
| 3.5 | 771 | 0.1% |
| 3.6 | 1827 | 0.3% |
| 3.8 | 718 | 0.1% |
| 4 | 5184 | 1.0% |
| Value | Count | Frequency (%) |
| 100 | 545 | 0.1% |
| 59 | 215 | < 0.1% |
| 50 | 1262 | 0.2% |
| 47.8 | 678 | 0.1% |
| 47 | 72 | < 0.1% |
| 44.5 | 27 | < 0.1% |
| 44.4 | 2899 | 0.5% |
| 40.5 | 563 | 0.1% |
| 40 | 758 | 0.1% |
| 38 | 1259 | 0.2% |
cluster_id
Categorical
HIGH CORRELATION 
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.1 MiB |
| cluster_0 | 311482 |
|---|---|
| cluster_9 | 44791 |
| cluster_4 | 36893 |
| cluster_3 | 35911 |
| cluster_6 | 24874 |
| Other values (5) | 78663 |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 9 |
| Min length | 9 |
Characters and Unicode
| Total characters | 4793526 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | cluster_1 |
|---|---|
| 2nd row | cluster_4 |
| 3rd row | cluster_7 |
| 4th row | cluster_7 |
| 5th row | cluster_0 |
Common Values
| Value | Count | Frequency (%) |
| cluster_0 | 311482 | 58.5% |
| cluster_9 | 44791 | 8.4% |
| cluster_4 | 36893 | 6.9% |
| cluster_3 | 35911 | 6.7% |
| cluster_6 | 24874 | 4.7% |
| cluster_8 | 22275 | 4.2% |
| cluster_7 | 16694 | 3.1% |
| cluster_5 | 15773 | 3.0% |
| cluster_2 | 13890 | 2.6% |
| cluster_1 | 10031 | 1.9% |
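
The frequency column appears to be each count divided by the total number of observations, rounded to one decimal place. The sketch below reproduces that under this assumption, using the counts from the table above. The helper name `frequency_pct` is illustrative.

```python
# Counts copied from the Common Values table for cluster_id.
counts = {
    "cluster_0": 311482, "cluster_9": 44791, "cluster_4": 36893,
    "cluster_3": 35911, "cluster_6": 24874, "cluster_8": 22275,
    "cluster_7": 16694, "cluster_5": 15773, "cluster_2": 13890,
    "cluster_1": 10031,
}

# With no missing values, the counts add up to the number of observations.
total = sum(counts.values())


def frequency_pct(count: int, total: int) -> float:
    """Share of `count` in `total`, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


for value, count in counts.items():
    print(f"{value}: {frequency_pct(count, total)}%")
```
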
Most occurring characters
| Value | Count | Frequency (%) |
| c | 532614 | 11.1% |
| l | 532614 | 11.1% |
| u | 532614 | 11.1% |
| s | 532614 | 11.1% |
| t | 532614 | 11.1% |
| e | 532614 | 11.1% |
| r | 532614 | 11.1% |
| _ | 532614 | 11.1% |
| 0 | 311482 | 6.5% |
| 9 | 44791 | 0.9% |
| Other values (8) | 176341 | 3.7% |
hierarchy1_id
Categorical
HIGH CORRELATION 
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 520.5 KiB |
| H00 | 224943 |
|---|---|
| H01 | 165294 |
| H03 | 141404 |
| H02 | 973 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 1597842 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | H00 |
|---|---|
| 2nd row | H00 |
| 3rd row | H00 |
| 4th row | H00 |
| 5th row | H00 |
Common Values
| Value | Count | Frequency (%) |
| H00 | 224943 | 42.2% |
| H01 | 165294 | 31.0% |
| H03 | 141404 | 26.5% |
| H02 | 973 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 757557 | 47.4% |
| H | 532614 | 33.3% |
| 1 | 165294 | 10.3% |
| 3 | 141404 | 8.8% |
| 2 | 973 | 0.1% |
hierarchy2_id
Categorical
HIGH CORRELATION 
| Distinct | 18 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 520.9 KiB |
| H0108 | 84549 |
|---|---|
| H0003 | 77588 |
| H0002 | 55668 |
| H0313 | 53774 |
| H0000 | 37446 |
| Other values (13) | 223589 |
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 5 |
| Min length | 5 |
Characters and Unicode
| Total characters | 2663070 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | H0000 |
|---|---|
| 2nd row | H0003 |
| 3rd row | H0003 |
| 4th row | H0003 |
| 5th row | H0003 |
Common Values
| Value | Count | Frequency (%) |
| H0108 | 84549 | 15.9% |
| H0003 | 77588 | 14.6% |
| H0002 | 55668 | 10.5% |
| H0313 | 53774 | 10.1% |
| H0000 | 37446 | 7.0% |
| H0312 | 36916 | 6.9% |
| H0001 | 35292 | 6.6% |
| H0106 | 34969 | 6.6% |
| H0107 | 29867 | 5.6% |
| H0314 | 19706 | 3.7% |
| Other values (8) | 66839 | 12.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1186213 | 44.5% |
| H | 532614 | 20.0% |
| 1 | 355666 | 13.4% |
| 3 | 272766 | 10.2% |
| 2 | 93557 | 3.5% |
| 8 | 84549 | 3.2% |
| 4 | 38655 | 1.5% |
| 6 | 36552 | 1.4% |
| 7 | 34513 | 1.3% |
| 5 | 27684 | 1.0% |
hierarchy3_id
Categorical
HIGH CARDINALITY  HIGH CORRELATION 
| Distinct | 77 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 522.9 KiB |
| H000312 | 31981 |
|---|---|
| H010601 | 26643 |
| H010807 | 24152 |
| H000004 | 22336 |
| H000200 | 20249 |
| Other values (72) | 407253 |
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 7 |
| Min length | 7 |
Characters and Unicode
| Total characters | 3728298 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | H000003 |
|---|---|
| 2nd row | H000316 |
| 3rd row | H000311 |
| 4th row | H000314 |
| 5th row | H000311 |
Common Values
| Value | Count | Frequency (%) |
| H000312 | 31981 | 6.0% |
| H010601 | 26643 | 5.0% |
| H010807 | 24152 | 4.5% |
| H000004 | 22336 | 4.2% |
| H000200 | 20249 | 3.8% |
| H010805 | 19395 | 3.6% |
| H031302 | 16777 | 3.1% |
| H000316 | 16696 | 3.1% |
| H000201 | 16464 | 3.1% |
| H000102 | 14183 | 2.7% |
| Other values (67) | 323738 | 60.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1679772 | 45.1% |
| 1 | 574386 | 15.4% |
| H | 532614 | 14.3% |
| 3 | 306925 | 8.2% |
| 2 | 182293 | 4.9% |
| 8 | 106394 | 2.9% |
| 5 | 91953 | 2.5% |
| 4 | 81151 | 2.2% |
| 7 | 79688 | 2.1% |
| 6 | 73400 | 2.0% |
hierarchy4_id
Categorical
HIGH CARDINALITY 
| Distinct | 151 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.0 MiB |
| H00031200 | 25007 |
|---|---|
| H01080500 | 14967 |
| H00010210 | 14183 |
| H00000405 | 13846 |
| H01060113 | 13047 |
| Other values (146) | 451564 |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 9 |
| Min length | 9 |
Characters and Unicode
| Total characters | 4793526 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | H00000309 |
|---|---|
| 2nd row | H00031609 |
| 3rd row | H00031100 |
| 4th row | H00031409 |
| 5th row | H00031109 |
Common Values
| Value | Count | Frequency (%) |
| H00031200 | 25007 | 4.7% |
| H01080500 | 14967 | 2.8% |
| H00010210 | 14183 | 2.7% |
| H00000405 | 13846 | 2.6% |
| H01060113 | 13047 | 2.4% |
| H00020000 | 12368 | 2.3% |
| H00031609 | 11845 | 2.2% |
| H01080900 | 8911 | 1.7% |
| H03130700 | 8843 | 1.7% |
| H01080709 | 8826 | 1.7% |
| Other values (141) | 400771 | 75.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 2332131 | 48.7% |
| 1 | 744777 | 15.5% |
| H | 532614 | 11.1% |
| 3 | 338064 | 7.1% |
| 2 | 209978 | 4.4% |
| 5 | 159172 | 3.3% |
| 8 | 115013 | 2.4% |
| 9 | 105322 | 2.2% |
| 4 | 94604 | 2.0% |
| 7 | 86627 | 1.8% |
hierarchy5_id
Categorical
HIGH CARDINALITY 
| Distinct | 292 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.0 MiB |
| H0001021012 | 10979 |
|---|---|
| H0000040501 | 10941 |
| H0003160922 | 9724 |
| H0002000926 | 7881 |
| H0000040001 | 6976 |
| Other values (287) | 486113 |
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 11 |
| Min length | 11 |
Characters and Unicode
| Total characters | 5858754 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | H0000030901 |
|---|---|
| 2nd row | H0003160922 |
| 3rd row | H0003110017 |
| 4th row | H0003140912 |
| 5th row | H0003110906 |
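
The sample rows suggest that each hierarchy ID extends its parent by two digits (for example `H00`, `H0003`, `H000316`, `H00031609`, `H0003160922`), which would explain the high correlation flagged between hierarchy levels. The report does not state this rule, so the sketch below treats it as an assumption. The function name is hypothetical.

```python
def hierarchy_levels(h5: str) -> list:
    """Split a level-5 ID into the five nested prefixes it appears to contain.

    Assumption: an ID is 'H' followed by two digits per hierarchy level,
    so level k is the first 1 + 2*k characters.
    """
    return [h5[: 1 + 2 * level] for level in range(1, 6)]


# Second sample row of hierarchy5_id.
levels = hierarchy_levels("H0003160922")
print(levels)
```

If the assumption holds for every row, the four coarser hierarchy columns carry no information beyond `hierarchy5_id`.
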
Common Values
| Value | Count | Frequency (%) |
| H0001021012 | 10979 | 2.1% |
| H0000040501 | 10941 | 2.1% |
| H0003160922 | 9724 | 1.8% |
| H0002000926 | 7881 | 1.5% |
| H0000040001 | 6976 | 1.3% |
| H0106011307 | 6823 | 1.3% |
| H0106011422 | 6376 | 1.2% |
| H0003120012 | 6334 | 1.2% |
| H0000030001 | 6038 | 1.1% |
| H0312110917 | 5896 | 1.1% |
| Other values (282) | 454646 | 85.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 2640566 | 45.1% |
| 1 | 1022351 | 17.4% |
| H | 532614 | 9.1% |
| 3 | 414634 | 7.1% |
| 2 | 410687 | 7.0% |
| 5 | 176809 | 3.0% |
| 6 | 157237 | 2.7% |
| 7 | 139239 | 2.4% |
| 8 | 127899 | 2.2% |
| 4 | 124501 | 2.1% |
storetype_id
Categorical
HIGH CORRELATION 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.1 MiB |
| ST04 | 470568 |
|---|---|
| ST03 | 62046 |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 2130456 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | ST03 |
|---|---|
| 2nd row | ST03 |
| 3rd row | ST03 |
| 4th row | ST03 |
| 5th row | ST03 |
Common Values
| Value | Count | Frequency (%) |
| ST04 | 470568 | 88.4% |
| ST03 | 62046 | 11.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| S | 532614 | 25.0% |
| T | 532614 | 25.0% |
| 0 | 532614 | 25.0% |
| 4 | 470568 | 22.1% |
| 3 | 62046 | 2.9% |
store_size
Categorical
HIGH CORRELATION 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.1 MiB |
| 45 | 267115 |
|---|---|
| 31 | 203453 |
| 13 | 62046 |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Characters and Unicode
| Total characters | 1065228 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 13 |
|---|---|
| 2nd row | 13 |
| 3rd row | 13 |
| 4th row | 13 |
| 5th row | 13 |
Common Values
| Value | Count | Frequency (%) |
| 45 | 267115 | 50.2% |
| 31 | 203453 | 38.2% |
| 13 | 62046 | 11.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 4 | 267115 | 25.1% |
| 5 | 267115 | 25.1% |
| 3 | 265499 | 24.9% |
| 1 | 265499 | 24.9% |
city_id_old
Categorical
CONSTANT 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.1 MiB |
| C006 | 532614 |
|---|---|
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 2130456 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | C006 |
|---|---|
| 2nd row | C006 |
| 3rd row | C006 |
| 4th row | C006 |
| 5th row | C006 |
Common Values
| Value | Count | Frequency (%) |
| C006 | 532614 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1065228 | 50.0% |
| C | 532614 | 25.0% |
| 6 | 532614 | 25.0% |
country_id
Categorical
CONSTANT 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.1 MiB |
| Turkey | 532614 |
|---|---|
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Characters and Unicode
| Total characters | 3195684 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Turkey |
|---|---|
| 2nd row | Turkey |
| 3rd row | Turkey |
| 4th row | Turkey |
| 5th row | Turkey |
Common Values
| Value | Count | Frequency (%) |
| Turkey | 532614 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| T | 532614 | 16.7% |
| u | 532614 | 16.7% |
| r | 532614 | 16.7% |
| k | 532614 | 16.7% |
| e | 532614 | 16.7% |
| y | 532614 | 16.7% |
city_code
Categorical
CONSTANT 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.1 MiB |
| Konya | 532614 |
|---|---|
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 5 |
| Min length | 5 |
Characters and Unicode
| Total characters | 2663070 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Konya |
|---|---|
| 2nd row | Konya |
| 3rd row | Konya |
| 4th row | Konya |
| 5th row | Konya |
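
Columns flagged CONSTANT, such as `city_id_old`, `country_id` and `city_code`, hold one value in every row, so they are usually dropped before modelling. A minimal sketch of detecting them follows. The two rows are made-up illustrations built from values seen in this report, and `constant_columns` is a hypothetical helper, not part of any profiling library.

```python
def constant_columns(rows: list) -> list:
    """Return the names of columns whose value is identical in every row."""
    if not rows:
        return []
    first = rows[0]
    return [
        name
        for name in first
        if all(row[name] == first[name] for row in rows)
    ]


# Illustrative rows only; the real dataset has 532614 observations.
rows = [
    {"city_id_old": "C006", "country_id": "Turkey", "city_code": "Konya", "weekday": "Mon"},
    {"city_id_old": "C006", "country_id": "Turkey", "city_code": "Konya", "weekday": "Sat"},
]

to_drop = constant_columns(rows)
print(to_drop)
```
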
Common Values
| Value | Count | Frequency (%) |
| Konya | 532614 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| K | 532614 | 20.0% |
| o | 532614 | 20.0% |
| n | 532614 | 20.0% |
| y | 532614 | 20.0% |
| a | 532614 | 20.0% |
day
Real number (ℝ)
| Distinct | 31 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.746554 |
| Minimum | 1 |
|---|---|
| Maximum | 31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.1 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 8 |
| median | 16 |
| Q3 | 23 |
| 95-th percentile | 29 |
| Maximum | 31 |
| Range | 30 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 8.7800467 |
|---|---|
| Coefficient of variation (CV) | 0.55758528 |
| Kurtosis | -1.1904873 |
| Mean | 15.746554 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 0.0046516988 |
| Sum | 8386835 |
| Variance | 77.08922 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 20 | 17599 | 3.3% |
| 19 | 17596 | 3.3% |
| 23 | 17595 | 3.3% |
| 14 | 17575 | 3.3% |
| 21 | 17573 | 3.3% |
| 18 | 17572 | 3.3% |
| 24 | 17568 | 3.3% |
| 22 | 17568 | 3.3% |
| 11 | 17566 | 3.3% |
| 16 | 17566 | 3.3% |
| Other values (21) | 356836 | 67.0% |
| Value | Count | Frequency (%) |
| 1 | 17011 | 3.2% |
| 2 | 17439 | 3.3% |
| 3 | 17443 | 3.3% |
| 4 | 17435 | 3.3% |
| 5 | 17441 | 3.3% |
| 6 | 17445 | 3.3% |
| 7 | 17453 | 3.3% |
| 8 | 17476 | 3.3% |
| 9 | 17520 | 3.3% |
| 10 | 17542 | 3.3% |
| Value | Count | Frequency (%) |
| 31 | 10123 | 1.9% |
| 30 | 16059 | 3.0% |
| 29 | 16041 | 3.0% |
| 28 | 17552 | 3.3% |
| 27 | 17531 | 3.3% |
| 26 | 17525 | 3.3% |
| 25 | 17561 | 3.3% |
| 24 | 17568 | 3.3% |
| 23 | 17595 | 3.3% |
| 22 | 17568 | 3.3% |
weekday
Categorical
HIGH CORRELATION 
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.1 MiB |
| Mon | 76399 |
|---|---|
| Sat | 76213 |
| Fri | 76084 |
| Sun | 76084 |
| Thu | 75975 |
| Other values (2) | 151859 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 1597842 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Mon |
|---|---|
| 2nd row | Mon |
| 3rd row | Mon |
| 4th row | Mon |
| 5th row | Mon |
Common Values
| Value | Count | Frequency (%) |
| Mon | 76399 | 14.3% |
| Sat | 76213 | 14.3% |
| Fri | 76084 | 14.3% |
| Sun | 76084 | 14.3% |
| Thu | 75975 | 14.3% |
| Wed | 75972 | 14.3% |
| Tue | 75887 | 14.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| u | 227946 | 14.3% |
| n | 152483 | 9.5% |
| S | 152297 | 9.5% |
| T | 151862 | 9.5% |
| e | 151859 | 9.5% |
| M | 76399 | 4.8% |
| o | 76399 | 4.8% |
| a | 76213 | 4.8% |
| t | 76213 | 4.8% |
| F | 76084 | 4.8% |
| Other values (5) | 380087 | 23.8% |
season
Categorical
HIGH CORRELATION 
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.1 MiB |
| 3 | 150328 |
|---|---|
| 2 | 145810 |
| 1 | 138489 |
| 4 | 97987 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 532614 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 3 | 150328 | 28.2% |
| 2 | 145810 | 27.4% |
| 1 | 138489 | 26.0% |
| 4 | 97987 | 18.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 150328 | 28.2% |
| 2 | 145810 | 27.4% |
| 1 | 138489 | 26.0% |
| 4 | 97987 | 18.4% |
week
Real number (ℝ)
HIGH CORRELATION 
| Distinct | 53 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 25.308542 |
| Minimum | 1 |
| Maximum | 53 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.1 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 13 |
| median | 25 |
| Q3 | 37 |
| 95-th percentile | 49 |
| Maximum | 53 |
| Range | 52 |
| Interquartile range (IQR) | 24 |
Descriptive statistics
| Standard deviation | 14.331339 |
|---|---|
| Coefficient of variation (CV) | 0.56626489 |
| Kurtosis | -1.0633578 |
| Mean | 25.308542 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | 0.1050274 |
| Sum | 13479684 |
| Variance | 205.38727 |
| Monotonicity | Not monotonic |
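Several of the `week` statistics above follow directly from the others. A short Python sketch that recomputes them from the printed (rounded) values, as a consistency check rather than a reproduction of how the report was generated; `n_obs` is the observation count from the dataset overview, and all variable names are illustrative:

```python
# Values copied from the `week` tables above (rounded as printed).
n_obs = 532614          # number of observations; `week` has no missing values
total_sum = 13479684    # Sum
std = 14.331339         # Standard deviation
q1, q3 = 13, 37         # Q1 and Q3
minimum, maximum = 1, 53

mean = total_sum / n_obs          # compare with Mean: 25.308542
variance = std ** 2               # compare with Variance: 205.38727
cv = std / mean                   # compare with CV: 0.56626489
iqr = q3 - q1                     # compare with IQR: 24
value_range = maximum - minimum   # compare with Range: 52

print(f"mean={mean:.6f} variance={variance:.5f} cv={cv:.8f}")
print(f"iqr={iqr} range={value_range}")
```

Small differences in the last digits are expected, because the inputs are the rounded figures shown in the report.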
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 30 | 11503 | 2.2% |
| 32 | 11501 | 2.2% |
| 31 | 11493 | 2.2% |
| 38 | 11483 | 2.2% |
| 33 | 11482 | 2.2% |
| 37 | 11474 | 2.2% |
| 39 | 11446 | 2.1% |
| 29 | 11437 | 2.1% |
| 28 | 11422 | 2.1% |
| 35 | 11372 | 2.1% |
| Other values (43) | 418001 | 78.5% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 7162 | 1.3% |
| 2 | 10603 | 2.0% |
| 3 | 10667 | 2.0% |
| 4 | 10657 | 2.0% |
| 5 | 10737 | 2.0% |
| 6 | 10792 | 2.0% |
| 7 | 10826 | 2.0% |
| 8 | 10794 | 2.0% |
| 9 | 10726 | 2.0% |
| 10 | 10888 | 2.0% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 53 | 3185 | 0.6% |
| 52 | 7547 | 1.4% |
| 51 | 7566 | 1.4% |
| 50 | 7515 | 1.4% |
| 49 | 7453 | 1.4% |
| 48 | 7348 | 1.4% |
| 47 | 7455 | 1.4% |
| 46 | 7518 | 1.4% |
| 45 | 7438 | 1.4% |
| 44 | 7402 | 1.4% |
holiday
Boolean
HIGH CORRELATION 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 520.3 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| False | 375463 | 70.5% |
| True | 157151 | 29.5% |
month_name
Categorical
HIGH CORRELATION 
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.1 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 1597842 |
|---|---|
| Distinct characters | 22 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Jan |
|---|---|
| 2nd row | Jan |
| 3rd row | Jan |
| 4th row | Jan |
| 5th row | Jan |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Jul | 50655 | 9.5% |
| Aug | 50646 | 9.5% |
| May | 49588 | 9.3% |
| Sep | 49027 | 9.2% |
| Mar | 48677 | 9.1% |
| Apr | 48274 | 9.1% |
| Jun | 47948 | 9.0% |
| Jan | 46650 | 8.8% |
| Feb | 43162 | 8.1% |
| Dec | 33306 | 6.3% |
| Other values (2) | 64681 | 12.1% |
Length
| Value | Count | Frequency (%) |
|---|---|---|
| jul | 50655 | 9.5% |
| aug | 50646 | 9.5% |
| may | 49588 | 9.3% |
| sep | 49027 | 9.2% |
| mar | 48677 | 9.1% |
| apr | 48274 | 9.1% |
| jun | 47948 | 9.0% |
| jan | 46650 | 8.8% |
| feb | 43162 | 8.1% |
| dec | 33306 | 6.3% |
| Other values (2) | 64681 | 12.1% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| u | 149249 | 9.3% |
| J | 145253 | 9.1% |
| a | 144915 | 9.1% |
| e | 125495 | 7.9% |
| A | 98920 | 6.2% |
| M | 98265 | 6.1% |
| p | 97301 | 6.1% |
| r | 96951 | 6.1% |
| n | 94598 | 5.9% |
| c | 66120 | 4.1% |
| Other values (12) | 480775 | 30.1% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1597842 | 100.0% |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
|---|---|---|
| u | 149249 | 9.3% |
| J | 145253 | 9.1% |
| a | 144915 | 9.1% |
| e | 125495 | 7.9% |
| A | 98920 | 6.2% |
| M | 98265 | 6.1% |
| p | 97301 | 6.1% |
| r | 96951 | 6.1% |
| n | 94598 | 5.9% |
| c | 66120 | 4.1% |
| Other values (12) | 480775 | 30.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1597842 | 100.0% |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
|---|---|---|
| u | 149249 | 9.3% |
| J | 145253 | 9.1% |
| a | 144915 | 9.1% |
| e | 125495 | 7.9% |
| A | 98920 | 6.2% |
| M | 98265 | 6.1% |
| p | 97301 | 6.1% |
| r | 96951 | 6.1% |
| n | 94598 | 5.9% |
| c | 66120 | 4.1% |
| Other values (12) | 480775 | 30.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1597842 | 100.0% |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
|---|---|---|
| u | 149249 | 9.3% |
| J | 145253 | 9.1% |
| a | 144915 | 9.1% |
| e | 125495 | 7.9% |
| A | 98920 | 6.2% |
| M | 98265 | 6.1% |
| p | 97301 | 6.1% |
| r | 96951 | 6.1% |
| n | 94598 | 5.9% |
| c | 66120 | 4.1% |
| Other values (12) | 480775 | 30.1% |
| | Unnamed: 0 | cluster_id | day | hierarchy1_id | hierarchy2_id | hierarchy3_id | holiday | month_name | price | product_depth | product_length | product_width | promo_bin_1 | promo_bin_2 | promo_discount_2 | promo_discount_type_2 | promo_type_1 | promo_type_2 | revenue | sales | season | stock | store_id | store_size | storetype_id | week | weekday |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Unnamed: 0 | 1.000 | 0.084 | 0.014 | 0.102 | 0.151 | 0.232 | 0.004 | 0.110 | 0.112 | -0.028 | 0.065 | -0.016 | 0.076 | 0.063 | 0.068 | 0.106 | 0.049 | 0.007 | -0.053 | -0.056 | 0.104 | -0.017 | 1.000 | 1.000 | 1.000 | 0.068 | 0.000 |
| cluster_id | 0.084 | 1.000 | 0.002 | 0.248 | 0.306 | 0.535 | 0.001 | 0.016 | -0.264 | 0.079 | -0.014 | -0.103 | 0.142 | 0.492 | 0.389 | 0.483 | 0.064 | 0.018 | 0.112 | 0.117 | 0.026 | 0.126 | 0.100 | 0.100 | 0.128 | 0.006 | 0.000 |
| day | 0.014 | 0.002 | 1.000 | 0.000 | 0.000 | 0.000 | 0.057 | 0.034 | 0.004 | -0.000 | 0.002 | -0.000 | 0.033 | 0.350 | 0.266 | 0.420 | 0.012 | 0.024 | -0.003 | -0.002 | 0.017 | 0.001 | 0.000 | 0.000 | 0.000 | 0.093 | 0.028 |
| hierarchy1_id | 0.102 | 0.248 | 0.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.014 | 0.466 | 0.175 | 0.040 | 0.160 | 0.177 | 0.461 | 0.522 | 0.642 | 0.101 | 0.011 | -0.212 | -0.225 | 0.011 | -0.223 | 0.123 | 0.123 | 0.136 | 0.008 | 0.000 |
| hierarchy2_id | 0.151 | 0.306 | 0.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.016 | 0.430 | 0.119 | -0.062 | 0.152 | 0.268 | 0.926 | 0.783 | 0.913 | 0.096 | 0.032 | -0.173 | -0.183 | 0.025 | -0.206 | 0.181 | 0.181 | 0.210 | 0.008 | 0.000 |
| hierarchy3_id | 0.232 | 0.535 | 0.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.033 | 0.424 | 0.105 | -0.077 | 0.145 | 0.435 | 0.957 | 0.869 | 0.984 | 0.153 | 0.051 | -0.174 | -0.184 | 0.054 | -0.207 | 0.272 | 0.272 | 0.324 | 0.007 | 0.000 |
| holiday | 0.004 | 0.001 | 0.057 | 0.000 | 0.000 | 0.000 | 1.000 | 0.037 | 0.002 | 0.001 | -0.000 | 0.001 | 0.045 | 0.434 | 0.438 | 0.470 | 0.035 | 0.012 | 0.053 | 0.051 | 0.011 | -0.002 | 0.000 | 0.000 | 0.000 | 0.010 | 0.978 |
| month_name | 0.110 | 0.016 | 0.034 | 0.014 | 0.016 | 0.033 | 0.037 | 1.000 | 0.007 | -0.003 | -0.001 | -0.002 | 0.090 | 0.360 | 0.388 | 0.665 | 0.036 | 0.038 | -0.000 | -0.001 | 1.000 | 0.000 | 0.010 | 0.010 | 0.010 | 0.309 | 0.030 |
| price | 0.112 | -0.264 | 0.004 | 0.466 | 0.430 | 0.424 | 0.002 | 0.007 | 1.000 | 0.252 | 0.298 | 0.110 | 0.043 | 1.000 | 1.000 | 1.000 | 0.081 | 0.000 | -0.233 | -0.261 | 0.016 | -0.390 | 0.022 | 0.022 | 0.010 | 0.021 | 0.000 |
| product_depth | -0.028 | 0.079 | -0.000 | 0.175 | 0.119 | 0.105 | 0.001 | -0.003 | 0.252 | 1.000 | 0.193 | 0.167 | 0.092 | 0.251 | 0.227 | 0.265 | 0.040 | 0.005 | -0.008 | -0.019 | 0.021 | -0.116 | 0.062 | 0.062 | 0.062 | 0.001 | 0.000 |
| product_length | 0.065 | -0.014 | 0.002 | 0.040 | -0.062 | -0.077 | -0.000 | -0.001 | 0.298 | 0.193 | 1.000 | 0.150 | 0.070 | 0.108 | 0.261 | 0.113 | 0.031 | 0.002 | -0.069 | -0.081 | 0.024 | -0.194 | 0.054 | 0.054 | 0.053 | 0.002 | 0.000 |
| product_width | -0.016 | -0.103 | -0.000 | 0.160 | 0.152 | 0.145 | 0.001 | -0.002 | 0.110 | 0.167 | 0.150 | 1.000 | 0.066 | 0.321 | 0.451 | 0.238 | 0.038 | 0.007 | -0.009 | -0.012 | 0.019 | -0.059 | 0.078 | 0.078 | 0.077 | -0.003 | 0.000 |
| promo_bin_1 | 0.076 | 0.142 | 0.033 | 0.177 | 0.268 | 0.435 | 0.045 | 0.090 | 0.043 | 0.092 | 0.070 | 0.066 | 1.000 | 0.943 | 0.798 | 0.888 | 0.419 | 0.029 | 0.077 | 0.080 | 0.064 | 0.119 | 0.087 | 0.087 | 0.117 | 0.026 | 0.032 |
| promo_bin_2 | 0.063 | 0.492 | 0.350 | 0.461 | 0.926 | 0.957 | 0.434 | 0.360 | 1.000 | 0.251 | 0.108 | 0.321 | 0.943 | 1.000 | 0.998 | 0.855 | 0.567 | 0.695 | -0.091 | -0.099 | 1.000 | -0.118 | 0.063 | 0.063 | 0.092 | -0.369 | 0.351 |
| promo_discount_2 | 0.068 | 0.389 | 0.266 | 0.522 | 0.783 | 0.869 | 0.438 | 0.388 | 1.000 | 0.227 | 0.261 | 0.451 | 0.798 | 0.998 | 1.000 | 0.727 | 0.400 | 0.703 | 0.098 | 0.102 | 1.000 | 0.044 | 0.068 | 0.068 | 0.107 | 0.416 | 0.259 |
| promo_discount_type_2 | 0.106 | 0.483 | 0.420 | 0.642 | 0.913 | 0.984 | 0.470 | 0.665 | 1.000 | 0.265 | 0.113 | 0.238 | 0.888 | 0.855 | 0.727 | 1.000 | 0.395 | 0.701 | 0.206 | 0.218 | 1.000 | 0.277 | 0.106 | 0.106 | 0.134 | -0.367 | 0.321 |
| promo_type_1 | 0.049 | 0.064 | 0.012 | 0.101 | 0.096 | 0.153 | 0.035 | 0.036 | 0.081 | 0.040 | 0.031 | 0.038 | 0.419 | 0.567 | 0.400 | 0.395 | 1.000 | 0.011 | -0.037 | -0.032 | 0.041 | 0.026 | 0.049 | 0.049 | 0.066 | 0.015 | 0.019 |
| promo_type_2 | 0.007 | 0.018 | 0.024 | 0.011 | 0.032 | 0.051 | 0.012 | 0.038 | 0.000 | 0.005 | 0.002 | 0.007 | 0.029 | 0.695 | 0.703 | 0.701 | 0.011 | 1.000 | -0.004 | -0.004 | 0.028 | -0.003 | 0.001 | 0.001 | 0.000 | -0.025 | 0.009 |
| revenue | -0.053 | 0.112 | -0.003 | -0.212 | -0.173 | -0.174 | 0.053 | -0.000 | -0.233 | -0.008 | -0.069 | -0.009 | 0.077 | -0.091 | 0.098 | 0.206 | -0.037 | -0.004 | 1.000 | 0.995 | 0.000 | 0.197 | 0.000 | 0.000 | 0.000 | 0.006 | 0.002 |
| sales | -0.056 | 0.117 | -0.002 | -0.225 | -0.183 | -0.184 | 0.051 | -0.001 | -0.261 | -0.019 | -0.081 | -0.012 | 0.080 | -0.099 | 0.102 | 0.218 | -0.032 | -0.004 | 0.995 | 1.000 | 0.003 | 0.212 | 0.012 | 0.012 | 0.006 | 0.005 | 0.006 |
| season | 0.104 | 0.026 | 0.017 | 0.011 | 0.025 | 0.054 | 0.011 | 1.000 | 0.016 | 0.021 | 0.024 | 0.019 | 0.064 | 1.000 | 1.000 | 1.000 | 0.041 | 0.028 | 0.000 | 0.003 | 1.000 | -0.003 | 0.008 | 0.008 | 0.009 | 0.966 | 0.007 |
| stock | -0.017 | 0.126 | 0.001 | -0.223 | -0.206 | -0.207 | -0.002 | 0.000 | -0.390 | -0.116 | -0.194 | -0.059 | 0.119 | -0.118 | 0.044 | 0.277 | 0.026 | -0.003 | 0.197 | 0.212 | -0.003 | 1.000 | 0.019 | 0.019 | 0.013 | -0.004 | 0.000 |
| store_id | 1.000 | 0.100 | 0.000 | 0.123 | 0.181 | 0.272 | 0.000 | 0.010 | 0.022 | 0.062 | 0.054 | 0.078 | 0.087 | 0.063 | 0.068 | 0.106 | 0.049 | 0.001 | 0.000 | 0.012 | 0.008 | 0.019 | 1.000 | 1.000 | 1.000 | 0.009 | 0.000 |
| store_size | 1.000 | 0.100 | 0.000 | 0.123 | 0.181 | 0.272 | 0.000 | 0.010 | 0.022 | 0.062 | 0.054 | 0.078 | 0.087 | 0.063 | 0.068 | 0.106 | 0.049 | 0.001 | 0.000 | 0.012 | 0.008 | 0.019 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 |
| storetype_id | 1.000 | 0.128 | 0.000 | 0.136 | 0.210 | 0.324 | 0.000 | 0.010 | 0.010 | 0.062 | 0.053 | 0.077 | 0.117 | 0.092 | 0.107 | 0.134 | 0.066 | 0.000 | 0.000 | 0.006 | 0.009 | 0.013 | 1.000 | 1.000 | 1.000 | 0.008 | 0.000 |
| week | 0.068 | 0.006 | 0.093 | 0.008 | 0.008 | 0.007 | 0.010 | 0.309 | 0.021 | 0.001 | 0.002 | -0.003 | 0.026 | -0.369 | 0.416 | -0.367 | 0.015 | -0.025 | 0.006 | 0.005 | 0.966 | -0.004 | 0.009 | 0.000 | 0.008 | 1.000 | 0.010 |
| weekday | 0.000 | 0.000 | 0.028 | 0.000 | 0.000 | 0.000 | 0.978 | 0.030 | 0.000 | 0.000 | 0.000 | 0.000 | 0.032 | 0.351 | 0.259 | 0.321 | 0.019 | 0.009 | 0.002 | 0.006 | 0.007 | 0.000 | 0.000 | 0.000 | 0.000 | 0.010 | 1.000 |
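The HIGH CORRELATION flags on `season`, `week`, `holiday` and `month_name` correspond to large off-diagonal entries in the matrix above. A minimal Python sketch of how such pairs could be picked out, using a handful of coefficients copied from the matrix; the 0.5 threshold and the function name `high_correlations` are assumptions for illustration, not settings confirmed by the report:

```python
# A few coefficients copied from the correlation matrix above.
pairs = {
    ("holiday", "weekday"): 0.978,
    ("revenue", "sales"): 0.995,
    ("season", "week"): 0.966,
    ("price", "stock"): -0.390,
    ("price", "sales"): -0.261,
    ("day", "week"): 0.093,
}


def high_correlations(pairs, threshold=0.5):
    """Return (a, b, r) tuples with |r| >= threshold, strongest first."""
    flagged = [(a, b, r) for (a, b), r in pairs.items() if abs(r) >= threshold]
    return sorted(flagged, key=lambda item: abs(item[2]), reverse=True)


flagged = high_correlations(pairs)
for a, b, r in flagged:
    print(f"{a} ~ {b}: {r:+.3f}")
```

Using the absolute value means strong negative relationships would be flagged as well; none of the sampled pairs reaches the assumed threshold on the negative side.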
First rows
| | Unnamed: 0 | store_id | product_id | date | sales | revenue | stock | price | promo_type_1 | promo_bin_1 | promo_type_2 | promo_bin_2 | promo_discount_2 | promo_discount_type_2 | product_length | product_depth | product_width | cluster_id | hierarchy1_id | hierarchy2_id | hierarchy3_id | hierarchy4_id | hierarchy5_id | storetype_id | store_size | city_id_old | country_id | city_code | day | weekday | season | week | holiday | month_name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1793963 | S0030 | P0015 | 2017-01-02 | 0.0 | 0.00 | 4.0 | 2.60 | PR14 | NaN | PR03 | NaN | NaN | NaN | 10.0 | 33.0 | 10.0 | cluster_1 | H00 | H0000 | H000003 | H00000309 | H0000030901 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan |
| 1 | 1793964 | S0030 | P0018 | 2017-01-02 | 1.0 | 1.81 | 5.0 | 1.95 | PR14 | NaN | PR03 | NaN | NaN | NaN | 1.0 | 14.0 | 11.0 | cluster_4 | H00 | H0003 | H000316 | H00031609 | H0003160922 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan |
| 2 | 1793965 | S0030 | P0035 | 2017-01-02 | 2.0 | 4.54 | 1.0 | 2.45 | PR14 | NaN | PR03 | NaN | NaN | NaN | 3.0 | 17.0 | 12.5 | cluster_7 | H00 | H0003 | H000311 | H00031100 | H0003110017 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan |
| 3 | 1793966 | S0030 | P0051 | 2017-01-02 | 0.0 | 0.00 | 27.0 | 0.70 | PR14 | NaN | PR03 | NaN | NaN | NaN | 1.7 | 17.5 | 4.5 | cluster_7 | H00 | H0003 | H000314 | H00031409 | H0003140912 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan |
| 4 | 1793967 | S0030 | P0055 | 2017-01-02 | 0.0 | 0.00 | 12.0 | 3.50 | PR05 | verylow | PR03 | NaN | NaN | NaN | 2.3 | 18.5 | 13.5 | cluster_0 | H00 | H0003 | H000311 | H00031109 | H0003110906 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan |
| 5 | 1793968 | S0030 | P0057 | 2017-01-02 | 0.0 | 0.00 | 4.0 | 12.90 | PR14 | NaN | PR03 | NaN | NaN | NaN | 4.0 | 22.0 | 9.0 | cluster_0 | H01 | H0108 | H010807 | H01080709 | H0108070901 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan |
| 6 | 1793969 | S0030 | P0062 | 2017-01-02 | 0.0 | 0.00 | 5.0 | 19.90 | PR14 | NaN | PR03 | NaN | NaN | NaN | NaN | NaN | NaN | cluster_0 | H03 | H0312 | H031205 | H03120507 | H0312050709 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan |
| 7 | 1793970 | S0030 | P0099 | 2017-01-02 | 0.0 | 0.00 | 5.0 | 10.90 | PR14 | NaN | PR03 | NaN | NaN | NaN | 3.0 | 11.3 | 10.0 | cluster_0 | H03 | H0313 | H031302 | H03130210 | H0313021001 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan |
| 8 | 1793971 | S0030 | P0103 | 2017-01-02 | 0.0 | 0.00 | 13.0 | 2.65 | PR14 | NaN | PR03 | NaN | NaN | NaN | 9.0 | 30.0 | 9.0 | cluster_0 | H00 | H0000 | H000003 | H00000300 | H0000030001 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan |
| 9 | 1793972 | S0030 | P0114 | 2017-01-02 | 1.0 | 0.42 | 20.0 | 0.45 | PR14 | NaN | PR03 | NaN | NaN | NaN | 2.0 | 7.5 | 9.0 | cluster_0 | H00 | H0003 | H000312 | H00031209 | H0003120906 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan |
Last rows
| | Unnamed: 0 | store_id | product_id | date | sales | revenue | stock | price | promo_type_1 | promo_bin_1 | promo_type_2 | promo_bin_2 | promo_discount_2 | promo_discount_type_2 | product_length | product_depth | product_width | cluster_id | hierarchy1_id | hierarchy2_id | hierarchy3_id | hierarchy4_id | hierarchy5_id | storetype_id | store_size | city_id_old | country_id | city_code | day | weekday | season | week | holiday | month_name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 532604 | 8519010 | S0142 | P0718 | 2019-09-30 | 0.0 | 0.0 | 29.0 | 23.75 | PR14 | NaN | PR03 | NaN | NaN | NaN | 5.0 | 18.0 | 10.0 | cluster_0 | H00 | H0004 | H000401 | H00040100 | H0004010026 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep |
| 532605 | 8519011 | S0142 | P0721 | 2019-09-30 | 0.0 | 0.0 | 6.0 | 14.50 | PR05 | moderate | PR03 | NaN | NaN | NaN | 1.5 | 25.0 | 12.5 | cluster_0 | H01 | H0108 | H010809 | H01080900 | H0108090001 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep |
| 532606 | 8519012 | S0142 | P0724 | 2019-09-30 | 0.0 | 0.0 | 12.0 | 7.90 | PR14 | NaN | PR03 | NaN | NaN | NaN | 6.0 | 24.0 | 10.0 | cluster_0 | H00 | H0003 | H000310 | H00031000 | H0003100001 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep |
| 532607 | 8519013 | S0142 | P0729 | 2019-09-30 | 0.0 | 0.0 | 2.0 | 69.90 | PR14 | NaN | PR03 | NaN | NaN | NaN | 19.0 | 23.0 | 20.0 | cluster_0 | H03 | H0315 | H031508 | H03150800 | H0315080020 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep |
| 532608 | 8519014 | S0142 | P0731 | 2019-09-30 | 0.0 | 0.0 | 18.0 | 9.90 | PR14 | NaN | PR03 | NaN | NaN | NaN | 8.0 | 18.0 | 8.0 | cluster_0 | H03 | H0314 | H031400 | H03140001 | H0314000101 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep |
| 532609 | 8519015 | S0142 | P0733 | 2019-09-30 | 0.0 | 0.0 | 12.0 | 0.75 | PR14 | NaN | PR03 | NaN | NaN | NaN | 2.0 | 4.0 | 9.0 | cluster_7 | H00 | H0003 | H000314 | H00031405 | H0003140506 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep |
| 532610 | 8519016 | S0142 | P0741 | 2019-09-30 | 0.0 | 0.0 | 3.0 | 32.90 | PR10 | verylow | PR03 | NaN | NaN | NaN | 3.8 | 16.4 | 9.5 | cluster_0 | H01 | H0106 | H010600 | H01060013 | H0106001345 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep |
| 532611 | 8519017 | S0142 | P0742 | 2019-09-30 | 0.0 | 0.0 | 5.0 | 69.90 | PR07 | verylow | PR03 | NaN | NaN | NaN | 6.4 | 7.0 | 6.4 | cluster_0 | H01 | H0108 | H010811 | H01081100 | H0108110038 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep |
| 532612 | 8519018 | S0142 | P0747 | 2019-09-30 | 0.0 | 0.0 | 16.0 | 21.90 | PR14 | NaN | PR03 | NaN | NaN | NaN | 23.0 | 23.0 | 33.3 | cluster_0 | H01 | H0107 | H010701 | H01070100 | H0107010026 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep |
| 532613 | 8519019 | S0142 | P0748 | 2019-09-30 | 0.0 | 0.0 | 18.0 | 18.90 | PR14 | NaN | PR03 | NaN | NaN | NaN | 3.8 | 4.8 | 15.3 | cluster_0 | H01 | H0108 | H010801 | H01080110 | H0108011006 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep |
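In the sample rows, the `day`, `weekday`, `week` and `month_name` columns are consistent with values derived from `date`. The report does not say how these columns were produced, so the following Python sketch is only one plausible derivation: ISO week numbering is an assumption that happens to match the rows shown, and the helper name `calendar_features` is hypothetical. `season` and `holiday` are left out because their encoding is not stated.

```python
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def calendar_features(iso_date):
    """Derive calendar columns from a YYYY-MM-DD string (assumed scheme)."""
    d = date.fromisoformat(iso_date)
    return {
        "day": d.day,
        "weekday": WEEKDAYS[d.weekday()],    # date.weekday(): Monday == 0
        "week": d.isocalendar()[1],          # ISO 8601 week number (assumption)
        "month_name": MONTHS[d.month - 1],
    }


first = calendar_features("2017-01-02")   # date of the first sample rows
last = calendar_features("2019-09-30")    # date of the last sample rows
print(first)
print(last)
```

ISO numbering also allows a week 53, which agrees with the maximum of 53 reported for `week`.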